We give an $O(\log^2 n)$-approximation algorithm, based on the cut-matching framework of [10, 13, 14], for
computing the sparsest cut on directed graphs. Our algorithm uses only $O(\log^2 n)$ single-commodity max-flow
computations and thus breaks the multicommodity-flow barrier for computing the sparsest cut on directed graphs.



1 Introduction 



The Directed Sparsest Cut problem (henceforth referred to as DSC) is the following: Given a graph $G = (V, E)$,
find a partition $(S, \bar{S})$ of $V$ which has the minimum directed edge expansion. The directed edge expansion of
a cut $(S, \bar{S})$ is defined as $\min\{|\delta^{out}(S)|, |\delta^{in}(S)|\} / \min\{|S|, |\bar{S}|\}$ ^1. This problem is known to be
NP-hard. Interest in this problem and its undirected version derives both from its numerous practical applications
such as image segmentation, VLSI layout and clustering (see the survey of Shmoys [16]), and from its theoretical
connections to spectral methods (see [7]), linear/semidefinite programming and metric embeddings. In this paper,
we provide fast algorithms for this problem using the framework proposed by Khandekar et al. in [10] (KRV) and
subsequently studied by Orecchia et al. [13] and Orecchia et al. [14] (OSVV). This framework reduces the NP-hard
DSC problem to the computation of a polylogarithmic number of single-commodity max-flows while, at the same
time, keeping the approximation factor polylogarithmic.
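
To make the objective concrete, the following is a minimal Python sketch that evaluates the directed edge expansion of a given cut and finds the optimum by exhaustive search on a tiny graph. It assumes the definition above (the smaller of the outgoing and incoming arc counts, divided by the size of the smaller side); the function names `directed_expansion` and `brute_force_dsc` are illustrative and not part of the paper, and the exhaustive search takes exponential time, so it is only a reference for very small inputs.

```python
from itertools import combinations


def directed_expansion(n, arcs, S):
    """Directed edge expansion of the cut (S, V minus S) on vertices 0..n-1.

    Assumes the definition min(|out(S)|, |in(S)|) / min(|S|, |V minus S|).
    """
    S = set(S)
    out_arcs = sum(1 for u, v in arcs if u in S and v not in S)
    in_arcs = sum(1 for u, v in arcs if u not in S and v in S)
    return min(out_arcs, in_arcs) / min(len(S), n - len(S))


def brute_force_dsc(n, arcs):
    """Try every nonempty proper subset S; exponential time, small n only."""
    best = None
    for k in range(1, n):
        for S in combinations(range(n), k):
            value = directed_expansion(n, arcs, S)
            if best is None or value < best[0]:
                best = (value, set(S))
    return best


# Directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
value, side = brute_force_dsc(4, cycle)
```

On the directed 4-cycle, the cut separating two consecutive vertices from the other two has one arc in each direction and two vertices on each side.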

The Cut-Matching Game framework : KRV and OSVV studied the Cut-Matching game on undirected graphs. At
the heart of the results of KRV and OSVV lies a two-person game, which starts with an empty graph on $n$ vertices
(assume $n$ is even): in each round, the CutPlayer chooses a bisection $(S, \bar{S})$ of the vertices and in response the
MatchingPlayer chooses a perfect matching between $S$ and $\bar{S}$. The game ends when the (multi-)graph consisting
of the multi-set union of the perfect matchings has edge expansion at least $\alpha$, where $\alpha$ is a parameter of the
game. The goal of the CutPlayer is to minimize the number of rounds of play, while the MatchingPlayer tries to
draw the game out for as many rounds as possible. Any strategy for the CutPlayer that guarantees termination
in $t$ rounds yields an $O(t/\alpha)$-approximation algorithm for Sparsest Cut. Moreover, the resulting algorithm runs
in time $O(t(T_c + T_f))$, where $T_c$ is the time to implement the cut player and $T_f$ is the running time of a
single-commodity max-flow computation on an $n$-vertex graph. KRV gave a cut-player strategy that achieves
$t = O(\log^2 n)$ and $\alpha = \Omega(1)$, thereby obtaining an $O(\log^2 n)$-approximation algorithm for the Sparsest Cut
problem on undirected graphs. OSVV gave a cut-player strategy that achieves $t = O(\log^2 n)$ and $\alpha = \Omega(\log n)$,
thereby obtaining an $O(\log n)$-approximation algorithm. In both these strategies the running time is dominated
by the polylogarithmic number of max-flow computations.

Related Work : Arora et al. [6] gave an $O(\sqrt{\log n})$-approximation algorithm for the undirected sparsest cut
problem. Their algorithm was based on semidefinite programming. The paper also introduced the framework of
expander flows. The basic idea here is that the sparsity of the given graph $G$ can be closely approximated by
finding an expander that can be embedded into $G$ with minimum congestion. Arora et al. [4] gave an efficient
multicommodity flow based implementation using the expander flow framework. They achieved running time
$\tilde{O}(n^2)$ ^2 for an $O(\sqrt{\log n})$ approximation. Breaking the multicommodity flow barrier to get an $O(\sqrt{\log n})$
approximation remained an open problem for many years until it was finally resolved by Sherman [15], who gave
an algorithm based on the framework of Arora et al. but used single-commodity max-flow computations between a
polylogarithmic number of carefully chosen vertices instead of a multicommodity max-flow computation. In a
different direction of study, Arora and Kale [5] gave an algorithm that achieves an approximation ratio of
$O(\log n)$ while still running in time dominated by a polylogarithmic number of single-commodity max-flow
computations. Their algorithm works in a more general framework for designing primal-dual algorithms for SDPs.

1 $\delta^{out}(S) = \{(u, v) \in E \mid u \in S, v \notin S\}$, $\delta^{in}(S) = \{(u, v) \in E \mid u \notin S, v \in S\}$
2 $\tilde{O}(f)$ denotes $O(f \, \mathrm{polylog}\, f)$

For DSC, Agarwal et al. [2] gave a semidefinite programming based $O(\sqrt{\log n})$-approximation algorithm. A
generalized version of this problem has been studied by Leighton and Rao [11] and by Agarwal et al. [1]. Fast
algorithms for single-commodity max-flow computations have been studied in [8, 12].

This work : We propose a cut-player strategy for cut-matching games on directed graphs. We show that our
cut-player strategy leads to an $O(\log^2 n)$-approximation algorithm for DSC. Our algorithm uses only $O(\log^2 n)$
single-commodity max-flow computations. To the best of our knowledge this is the first study of cut-matching
games on directed graphs, and the first polylogarithmic approximation algorithm for this problem which breaks
the multicommodity flow barrier.

Our CutPlayer strategy is quite similar to that of KRV. Given a set of matchings, their cut player starts with
a suitably chosen initial distribution on the vertices and computes the probability distribution that results from
running a random walk on the graph consisting of the union of the matchings. It then sorts the vertices according
to their final probabilities and outputs the median cut, say $(S, \bar{S})$. Their matching player computes a perfect
matching across the cut. The key to obtaining an approximation algorithm from a cut-matching game is to ensure
that the matching player produces a matching that is embeddable in the input graph. In the directed setting, this
matching-player strategy fails, as outputting a directed matching from $S$ to $\bar{S}$ (or from $\bar{S}$ to $S$) will
not suffice if we want the union of matchings to be a (directed) expander. Trying to mirror the KRV analysis in the
directed setting, one would want the MatchingPlayer to find a symmetric matching ^3. But the input graph need
not contain a symmetric matching. We introduce a relaxed notion of perfect matchings in directed graphs and
show how to compute them using single-commodity max-flows. The analysis of our cut-matching game is similar
to that of KRV and OSVV in that it makes use of a potential function related to the mixing of the walk on the
current union of matchings. However, bounding the change in the potential function is now much trickier.

2 Review of the Cut-Matching Game framework 

Certifying expansion. To certify that a given graph $G$ has no sparse cut, one could use the expander flow
formalism of Arora et al. [6]. This consists of constructing a graph $H$ of known expansion on $n$ vertices and
embedding it as a flow in $G$ such that the flow routed through each edge in $G$ is at most 1. This certifies that
the expansion of $H$ is a lower bound on the expansion of $G$. The seminal result of Leighton and Rao [11] may
be viewed as a multicommodity flow based algorithm that either embeds the scaled complete graph in any
$n$-vertex graph $G$ or produces a sparse cut in $G$. The core of the algorithms in [6, 4, 10, 5, 14] is based on this
expander flow formulation.

The players. The game, consisting of 2 players, is played on the input graph $G = (V, E)$. The CutPlayer takes
as input a series of perfect matchings on the vertex set $V$ and outputs a bisection $(S, \bar{S})$ of the vertex set. The
MatchingPlayer takes as input this bisection and either outputs a perfect matching across it which is embeddable ^4
in $G$, or outputs a sparse cut to prove that no such matching across the bisection exists. We formally present the
algorithm based on the cut-matching game in Figure 1. (We assume that the CutPlayer and the MatchingPlayer
are oracles. In Section 3 we show how to implement them.)

3 A matching $M$ on a directed graph $G$ is a symmetric matching if $(u, v) \in M \Rightarrow (v, u) \in M$.

4 A matching $M$ on the vertex set $V$ is said to be embeddable in a graph $G = (V, E)$ if $G$ is able to simultaneously
support a unit flow from vertex $u$ to vertex $v$ for all $(u, v) \in M$.

1. Input: graph $G = (V, E)$ and a parameter $\alpha$.

   (a) $t = 0$, $M = \emptyset$

   (b) CutPlayer: Let $(S, \bar{S})$ be the cut returned by CutPlayer$(\{M_1, \ldots, M_t\})$

   (c) MatchingPlayer:

       i. If MatchingPlayer$(S, \bar{S})$ successfully returns a matching $M$ between $S$ and $\bar{S}$ then

          A. $M_{t+1} := M$

          B. $t := t + 1$

       ii. If MatchingPlayer$(S, \bar{S})$ returns a cut $(C, \bar{C})$ then output $(C, \bar{C})$ and End.

   (d) If $\bigcup_{i=1}^{t} M_i$ is an $\alpha/2$-expander, then End.

Figure 1: The algorithm for sparsest cut based on the cut-matching game
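
The control flow of Figure 1 can be sketched in Python as follows. This is only a skeleton under stated assumptions: the three callables `cut_player`, `matching_player` and `is_expander` are hypothetical stand-ins for the oracles, the return conventions (`"matching"`, `"cut"`, `"expander"`, `"undecided"`) are invented for the illustration, and the `max_rounds` bound is added so that the loop always terminates.

```python
def cut_matching_game(cut_player, matching_player, is_expander, max_rounds):
    """Skeleton of the loop in Figure 1; all three callables are hypothetical.

    cut_player(matchings)        -> (S, S_bar)
    matching_player(S, S_bar)    -> ("matching", M) or ("cut", (C, C_bar))
    is_expander(matchings)       -> bool, the stopping test of step (d)
    """
    matchings = []                                  # step (a): t = 0
    for _ in range(max_rounds):
        S, S_bar = cut_player(matchings)            # step (b)
        kind, result = matching_player(S, S_bar)    # step (c)
        if kind == "cut":
            return "cut", result                    # step (c)(ii): sparse cut
        matchings.append(result)                    # step (c)(i): M_{t+1} := M
        if is_expander(matchings):                  # step (d)
            return "expander", matchings
    return "undecided", matchings
```

Passing the oracles as parameters keeps the loop independent of how the two players are implemented, which mirrors the way the text treats them as oracles at this point.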



3 The Algorithm 

We say that a set of edges $M$ is a directed matching on the set of vertices $V$ if $|\delta^{out}_M(v)|, |\delta^{in}_M(v)| \le 1$
$\forall v \in V$. The directed matching is a perfect directed matching if $|\delta^{out}_M(v)| = |\delta^{in}_M(v)| = 1$ $\forall v \in V$. For the
rest of the paper, matching and perfect matching refer to directed matching and perfect directed matching respectively.

At a high level our algorithm may be viewed as iteratively building an expander $H$ that embeds in the graph
$G$ with congestion $O(\log^2 n / \alpha)$. Let $\{M_1, \ldots, M_t\}$ be a sequence of perfect matchings. We define a $t$-step random
walk associated with this sequence of matchings as follows: in the $i$-th step, the particle stays put with probability
1/2 and traverses the out-edge from the current location in $M_i$ with probability 1/2. The sequence of matchings
$\{M_1, \ldots, M_t\}$ is mixing if, for any starting position, the probability of the particle reaching any vertex $v$ is at
least $1/2n$. Observe that a mixing sequence of matchings forms a graph with directed edge expansion 1/2 (Lemma A.1).

As in KRV and OSVV, we use a potential function defined on $\{M_1, \ldots, M_t\}$ to measure how far from uniform
the resulting distribution of the associated random walk is when starting from a random vertex. Formally,
$\psi(t) = \sum_{i,j} (P_{ij}(t) - 1/n)^2$, where $P_{ij}(t)$ is the probability that a particle starting at $j$ reaches $i$ in the random
walk associated with $\{M_1, \ldots, M_t\}$. Observe that $\psi(0) = n - 1$, and that if $\psi(t) < 1/4n^2$ then the sequence
$\{M_1, \ldots, M_t\}$ is mixing.
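
The walk and the potential can be computed exactly for small examples. The sketch below is an illustration, not the paper's implementation: it represents each perfect directed matching as a list of arcs `(u, v)`, applies the lazy step "stay with probability 1/2, follow the matching arc with probability 1/2" to the matrix $P$, and evaluates $\psi$ and the mixing condition with exact rational arithmetic. The names `walk_matrix`, `potential` and `is_mixing` are assumptions made for the example.

```python
from fractions import Fraction


def walk_matrix(n, matchings):
    """P[i][j] = probability that a particle started at j ends at i.

    Each matching is a list of arcs (u, v) with exactly one arc leaving and
    one arc entering every vertex of 0..n-1.
    """
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for M in matchings:
        pred = {v: u for u, v in M}      # the unique arc (pred[i], i) into i
        # A particle is at i now if it was at i and stayed, or was at
        # pred[i] and moved along the matching arc.
        P = [[(P[i][k] + P[pred[i]][k]) / 2 for k in range(n)]
             for i in range(n)]
    return P


def potential(P):
    """psi = sum over all i, j of (P[i][j] - 1/n)^2."""
    n = len(P)
    return sum((p - Fraction(1, n)) ** 2 for row in P for p in row)


def is_mixing(P):
    """True if every entry of P is at least 1/(2n)."""
    n = len(P)
    return all(p >= Fraction(1, 2 * n) for row in P for p in row)


M1 = [(0, 1), (1, 0), (2, 3), (3, 2)]
M2 = [(0, 2), (2, 0), (1, 3), (3, 1)]
psi_start = potential(walk_matrix(4, []))
psi_one = potential(walk_matrix(4, [M1]))
psi_two = potential(walk_matrix(4, [M1, M2]))
```

In this four-vertex example the two matchings together spread every starting vertex uniformly, so the potential falls from $n - 1 = 3$ to 0 in two rounds.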

The algorithm in Figure 1 starts with an empty sequence of matchings in $G$, and while the sequence is not
mixing, it tries to find a new matching to add to the sequence. Our MatchingPlayer will try to find a matching that
is embeddable in $G$ with congestion $1/\alpha$, and the CutPlayer will ensure that any matching across the
cut it outputs will reduce the potential by a factor of $(1 - 1/\log n)$. If the MatchingPlayer succeeds, then $O(\log^2 n)$
iterations will suffice to produce a mixing sequence of matchings. In this case, the union of the matchings, which
forms a 1/2-edge expander, can be embedded in $G$ with congestion $O((\log^2 n)/\alpha)$. In case the MatchingPlayer
does not succeed in finding a matching in some iteration, the MatchingPlayer outputs a cut in $G$ with
expansion at most $\alpha$.

We present our CutPlayer and MatchingPlayer in Figures 2 and 3 respectively. The CutPlayer runs in $\tilde{O}(n)$ time
and the MatchingPlayer runs in max-flow time. Note that the matching output by the MatchingPlayer is not a
subset of edges of the input graph $G$, but a matching that is embeddable in $G$. We will show that the game in
Figure 1, using these CutPlayer and MatchingPlayer oracles, proves the following theorem:

Theorem 3.1. Given a graph $G = (V, E)$ and an $\alpha$, there exists an algorithm that

- either outputs a cut of expansion at most $\alpha$

- or proves that every cut has expansion at least $\alpha / \log^2 n$ by embedding in $G$ an $\alpha$-expander with congestion at
most $O(\log^2 n)$.

Moreover, the algorithm can be implemented using $O(\log^2 n)$ single-commodity max-flow computations.
By doing a binary search on $\alpha$ we get an $O(\log^2 n)$-approximation algorithm for DSC:

Corollary 3.2. There exists an $O(\log^2 n)$-approximation algorithm for DSC whose running time is dominated by a
polylogarithmic number of single-commodity max-flow computations.
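
The binary search on $\alpha$ could be organised as in the following sketch. It is an assumption-laden illustration rather than the paper's procedure: `decide` is a hypothetical wrapper around the algorithm of Theorem 3.1 that returns a cut when one of expansion at most $\alpha$ is found and `None` when the expander embedding succeeds, the search is geometric over real values (ignoring the integrality of $1/\alpha$ assumed by the MatchingPlayer), and the stopping ratio is a free parameter.

```python
def search_alpha(decide, lo, hi, ratio=2.0):
    """Geometric binary search for a threshold value of alpha.

    Invariant: decide(lo) is None (expansion certified at lo) and
    decide(hi) returned a cut. Stops once hi <= ratio * lo.
    `decide` is a hypothetical oracle, not an API from the paper.
    """
    cut = decide(hi)
    if cut is None or decide(lo) is not None:
        raise ValueError("need decide(lo) is None and decide(hi) a cut")
    while hi > ratio * lo:
        mid = (lo * hi) ** 0.5          # geometric midpoint
        result = decide(mid)
        if result is None:
            lo = mid                    # expansion certified at mid
        else:
            hi, cut = mid, result       # a cut of expansion at most mid
    return hi, cut


# Toy oracle: pretends a cut is found exactly when alpha >= 0.3.
alpha, found = search_alpha(lambda a: ("cut", a) if a >= 0.3 else None,
                            lo=0.01, hi=4.0)
```

The invariant does not require the oracle to be monotone in $\alpha$; it only guarantees that the returned value lies within the chosen ratio of some value at which expansion was certified.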

CutPlayer

Input: sequence of matchings $\{M_1, \ldots, M_t\}$

1. Choose $r$, a random unit $n$-dimensional vector orthogonal to $\mathbf{1}$ (the all-ones vector).

2. Compute $u := \left(\frac{I}{2} + \frac{M_t}{2}\right) \cdots \left(\frac{I}{2} + \frac{M_1}{2}\right) r$, where $M_i$ is the adjacency matrix corresponding to the
matching $M_i$.

3. Let $S$ be the set of vertices corresponding to the $n/2$ smallest values of $u$.

4. Output $(S, \bar{S})$.



Figure 2: CutPlayer
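
A minimal Python sketch of the four steps of Figure 2 follows. Two choices in it are assumptions rather than statements from the paper: the random unit vector orthogonal to the all-ones vector is obtained by centring and normalising a Gaussian vector, and the adjacency matrix of a matching is taken to act as $(M x)_v = x_w$ for the arc $(w, v)$, which is the convention that agrees with Equation 1. The function name `cut_player` and the `rng` parameter are illustrative.

```python
import math
import random


def cut_player(n, matchings, rng=random):
    """Return a bisection (S, S_bar) of 0..n-1 following Figure 2."""
    # Step 1: random unit vector orthogonal to the all-ones vector,
    # obtained by centring and normalising a Gaussian vector.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(g) / n
    centred = [x - mean for x in g]
    norm = math.sqrt(sum(x * x for x in centred))
    u = [x / norm for x in centred]
    # Step 2: apply (I/2 + M_1/2) first, then M_2, ..., M_t.
    for M in matchings:
        pred = {v: w for w, v in M}      # arc (pred[v], v) enters v
        u = [(u[i] + u[pred[i]]) / 2 for i in range(n)]
    # Step 3: S holds the n/2 vertices with the smallest values of u.
    order = sorted(range(n), key=lambda i: u[i])
    S = set(order[: n // 2])
    # Step 4: output the bisection.
    return S, set(range(n)) - S


two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
S, S_bar = cut_player(6, [two_triangles], rng=random.Random(0))
```

Each of the $t$ updates in step 2 touches every vertex once, so the sketch performs $O(nt)$ arithmetic operations plus one sort.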

MatchingPlayer

Input: A bisection $(S, \bar{S})$ of the vertex set of graph $G = (V, E)$

1. Construct a flow network as follows: Assign each edge in $G$ a capacity of $1/\alpha$ (which we assume to be
integral), add a source node with an outgoing unit-capacity arc to each vertex in $S$, and add a sink node with
an incoming unit-capacity arc from each vertex in $\bar{S}$.

2. Compute a maximum flow between the source and the sink.

(a) If the flow value is $n/2$, then compute $\overrightarrow{M}$: a matching from $S$ to $\bar{S}$ obtained by decomposing the
flow into flow-paths in a standard manner (see [3]).

(b) If the flow value is less than $n/2$, find a minimum cut $(C, \bar{C})$ separating the source and the sink in the flow
network and output $(C \setminus \{\text{source}\}, \bar{C} \setminus \{\text{sink}\})$ and End.

3. Similarly, compute a matching $\overleftarrow{M}$ from $\bar{S}$ to $S$ or output a cut.

4. Output $\overrightarrow{M} \cup \overleftarrow{M}$ and End.



Figure 3: MatchingPlayer
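
The steps of Figure 3 can be sketched in Python as below. This is an illustrative implementation under stated assumptions, not the paper's code: it uses a plain Edmonds-Karp max-flow rather than the fast max-flow algorithms cited above, takes the integral arc capacity $1/\alpha$ as the parameter `alpha_inv`, and decomposes the integral flow by walking one unit at a time from each source-side vertex to the sink. The names `max_flow`, `route_one_way` and `matching_player` are invented for the example.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp. Returns (value, net flow, residual-reachable set)."""
    flow = defaultdict(lambda: defaultdict(int))   # skew-symmetric net flow
    adj = defaultdict(set)
    for u in list(cap):
        for v in list(cap[u]):
            adj[u].add(v)
            adj[v].add(u)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:           # BFS in the residual graph
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                        # no augmenting path left
            return value, flow, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += b
            flow[v][u] -= b
        value += b


def route_one_way(n, arcs, S, alpha_inv):
    """Steps 1-2 of Figure 3 for one direction, from S to its complement."""
    S = set(S)
    S_bar = set(range(n)) - S
    cap = defaultdict(lambda: defaultdict(int))
    for u, v in arcs:
        cap[u][v] += alpha_inv                     # capacity 1/alpha per arc
    for v in S:
        cap["source"][v] = 1
    for v in S_bar:
        cap[v]["sink"] = 1
    value, flow, reach = max_flow(cap, "source", "sink")
    if value < len(S):                             # step 2(b): output a cut
        C = reach - {"source"}
        return "cut", (C, set(range(n)) - C)
    pairs = []                                     # step 2(a): decompose
    for start in sorted(S):
        v = start
        while flow[v]["sink"] <= 0:                # consume one unit per arc
            w = next(x for x in range(n) if flow[v][x] > 0)
            flow[v][w] -= 1
            v = w
        flow[v]["sink"] -= 1
        pairs.append((start, v))
    return "matching", pairs


def matching_player(n, arcs, S, alpha_inv):
    """Run both directions and return their union, or the first cut found."""
    S = set(S)
    kind, forward = route_one_way(n, arcs, S, alpha_inv)
    if kind == "cut":
        return kind, forward
    kind, backward = route_one_way(n, arcs, set(range(n)) - S, alpha_inv)
    if kind == "cut":
        return kind, backward
    return "matching", forward + backward


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
tight = matching_player(4, cycle, {0, 1}, 1)       # capacity 1 per arc
loose = matching_player(4, cycle, {0, 1}, 2)       # capacity 2 per arc
```

On the directed 4-cycle with the bisection $\{0, 1\}$, a single arc crosses in each direction, so unit capacities cannot carry two units of flow while capacity 2 can.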



4 Analysis 

4.1 Analysis of the CutPlayer 

Recall that $P_{ij}(t)$ is the probability of going from $j$ to $i$ in the natural random walk associated with
$\{M_1, \ldots, M_t\}$. We define the vector $P_i(t) \stackrel{\text{def}}{=} (P_{i1}(t), \ldots, P_{in}(t))$ to denote the probability of ending up at $i$
starting from each vertex. Observe that the entries in $P_i(t)$ sum up to 1. This follows by induction, since if
$(j, i) \in M_t$ then $P_i(t) = (P_i(t-1) + P_j(t-1))/2$ (see Equation 1) and $\mathbf{1}^T \cdot P_i(0) = 1$ $\forall i \in V$.

Note that the vector $u$ produced by CutPlayer on $M_1, \ldots, M_t$ consists of the projections of the vectors $P_i(t)$ onto the
randomly chosen vector $r \perp \mathbf{1}$, i.e., $u_i = P_i(t) \cdot r$. The CutPlayer partitions the vertices into two sets according
to whether the corresponding $u_i$'s are large or small. Thus in any matching respecting this partition, if vertices $i$
and $j$ are matched, $|u_i - u_j|$ will tend to be large. Note that $u_i - u_j$ is the projection of $P_i(t) - P_j(t)$ on a random
vector $r$. Since the vector $P_i(t) - P_j(t)$ lies in the $(n-1)$-dimensional space orthogonal to $\mathbf{1}$ and $r$ is a unit vector
chosen uniformly at random from this space, we have $E[|u_i - u_j|^2] = \|P_i(t) - P_j(t)\|^2 / (n - 1)$.
We make use of the following 3 lemmas:

Lemma 4.1. The reduction in potential for a perfect matching $M$ is $\frac{1}{4} \sum_{(i,j) \in M} \|P_i(t) - P_j(t)\|^2$.

Lemma 4.2. Whp, for all pairs $i, j$ we have $\|P_i(t) - P_j(t)\|^2 \ge c \, \frac{n-1}{\log n} \, |u_i - u_j|^2$.

Lemma 4.3. For a perfect matching $M$ output by MatchingPlayer across a partition found by CutPlayer, we have
$(n-1) \, E\left[\sum_{(i,j) \in M} |u_i - u_j|^2\right] \ge 2\psi(t)$.

The inequality in Lemma 4.2 holds with probability at least $1 - n^{-\Omega(1)}$. Clearly, these three lemmas together imply
that the expected reduction in the potential in any iteration is an $\Omega(1/\log n)$ fraction of the current potential. This
implies that in $O(\log^2 n)$ iterations the potential drops below $1/4n^2$ with high probability and that the union of
matchings forms an expander. We now prove these lemmas.
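
As a rough illustration of where the $O(\log^2 n)$ bound on the number of rounds comes from, suppose (a simplifying assumption that ignores the distinction between expected and high-probability statements) that every round reduces the potential by a factor of at least $(1 - c/\log n)$ for some constant $c > 0$. Starting from $\psi(0) = n - 1$, after $T$ rounds:

```latex
\[
\psi(T) \;\le\; \Bigl(1 - \frac{c}{\log n}\Bigr)^{T} \psi(0)
        \;\le\; (n-1)\, e^{-cT/\log n},
\]
\[
(n-1)\, e^{-cT/\log n} < \frac{1}{4n^{2}}
\quad\Longleftrightarrow\quad
T > \frac{\log n}{c}\, \ln\bigl(4n^{2}(n-1)\bigr) = O(\log^{2} n).
\]
```

The second factor $\ln(4n^2(n-1))$ is $O(\log n)$, which together with the $\log n / c$ factor gives the stated number of rounds.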



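Proof of Lemma 4.1: For a particle starting at vertex $k$ to reach vertex $i$ in $t$ steps by following the natural random
walk defined on the sequence $\{M_1, \ldots, M_t\}$, it can either reach $i$ in $t - 1$ steps and stay there in the next round
with probability 1/2, or it can reach a vertex $j$ such that $(j, i) \in M_t$ in $t - 1$ steps and traverse $(j, i)$ in the $t$-th
step with probability 1/2.

Therefore, $P_{ik}(t) = (P_{ik}(t-1) + P_{jk}(t-1))/2$. Generalizing this, we get

\[
P_i(t) = \frac{P_i(t-1) + P_j(t-1)}{2} \qquad (1)
\]

Recall that $\psi(t) = \sum_{i \in V} \|P_i(t) - \mathbf{1}/n\|^2$. On adding a perfect matching $M$ across the cut produced by the
CutPlayer, the decrease in potential $\Delta\psi(t)$ is

\[
\begin{aligned}
\Delta\psi(t) &= \sum_{i \in V} \|P_i(t) - \mathbf{1}/n\|^2 - \sum_{i \in V} \|P_i(t+1) - \mathbf{1}/n\|^2 \\
&= \sum_{i \in V} \|P_i(t) - \mathbf{1}/n\|^2 - \sum_{(i,j) \in M} \Bigl\|\frac{P_i(t) + P_j(t)}{2} - \mathbf{1}/n\Bigr\|^2 \\
&= \sum_{i \in V} \|\bar{P}_i\|^2 - \sum_{(i,j) \in M} \Bigl\|\frac{\bar{P}_i + \bar{P}_j}{2}\Bigr\|^2 \qquad (\text{where } \bar{P}_i = P_i(t) - \mathbf{1}/n) \\
&= \sum_{i \in V} \frac{\|\bar{P}_i\|^2}{2} - \sum_{(i,j) \in M} \frac{\bar{P}_i \cdot \bar{P}_j}{2} \\
&= \frac{1}{4} \sum_{(i,j) \in M} \|P_i(t) - P_j(t)\|^2
\end{aligned}
\]

The last two steps use the fact that, in a perfect directed matching, every vertex is the tail of exactly one arc and
the head of exactly one arc, so that $\sum_{(i,j) \in M} (\|\bar{P}_i\|^2 + \|\bar{P}_j\|^2) = 2 \sum_{i \in V} \|\bar{P}_i\|^2$.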
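
The identity of Lemma 4.1 can be checked numerically on a small example. The sketch below is only a sanity check with exact rational arithmetic, under the same conventions as Equation 1 (the arc $(j, i)$ of the matching averages row $j$ into row $i$); the helper names `step`, `potential` and `predicted_drop` are illustrative.

```python
from fractions import Fraction


def step(P, M):
    """One round of the lazy walk: row i becomes (row i + row pred(i)) / 2."""
    n = len(P)
    pred = {v: u for u, v in M}
    return [[(P[i][k] + P[pred[i]][k]) / 2 for k in range(n)]
            for i in range(n)]


def potential(P):
    """psi = sum over all i, j of (P[i][j] - 1/n)^2."""
    n = len(P)
    return sum((p - Fraction(1, n)) ** 2 for row in P for p in row)


def predicted_drop(P, M):
    """Right-hand side of Lemma 4.1: (1/4) * sum over arcs of ||P_i - P_j||^2."""
    return Fraction(1, 4) * sum(
        sum((a - b) ** 2 for a, b in zip(P[i], P[j])) for i, j in M)


n = 4
P0 = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
M1 = [(0, 2), (2, 1), (1, 3), (3, 0)]   # a directed 4-cycle
M2 = [(0, 1), (1, 0), (2, 3), (3, 2)]   # two directed 2-cycles
P1 = step(P0, M1)
P2 = step(P1, M2)
drop_1 = potential(P0) - potential(P1)
drop_2 = potential(P1) - potential(P2)
```

Both a directed cycle through all vertices and a union of 2-cycles are perfect directed matchings in the sense of Section 3, so the example exercises the directed and the symmetric case.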



Proof of Lemma 4.2: Observe that $u_i - u_j$ is the projection of $P_i(t) - P_j(t)$ onto $r$. The lemma follows
from the Gaussian behavior of projections and has been proved in [10].

□ 

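Proof of Lemma 4.3: Let $\overrightarrow{M}$ (the matching from $S$ to $\bar{S}$) and $\overleftarrow{M}$ (the matching from $\bar{S}$ to $S$) be the 2
components of $M$ (see MatchingPlayer in Figure 3). Let $(S, \bar{S})$ be the bisection found by CutPlayer. Recall that $S$
is the set of $n/2$ vertices $i$ with smallest values of $u_i$. Let $\eta$ be a real number so that $u_i \le \eta \le u_j$ for any
$i \in S$ and $j \in \bar{S}$. We then have:

\[
\begin{aligned}
\sum_{(i,j) \in \overrightarrow{M}} |u_i - u_j|^2 &\ge \sum_{(i,j) \in \overrightarrow{M}} \bigl((u_i - \eta)^2 + (u_j - \eta)^2\bigr) \\
&= \sum_{i \in V} (u_i - \eta)^2 \\
&= \sum_{i \in V} u_i^2 - 2\eta \sum_{i \in V} u_i + n\eta^2 \\
&= \sum_{i \in V} u_i^2 - 2\eta \sum_{i \in V} P_i(t) \cdot r + n\eta^2 \qquad (u_i = P_i(t) \cdot r) \\
&= \sum_{i \in V} u_i^2 - 2\eta \, \mathbf{1}^T \cdot r + n\eta^2 \qquad \Bigl(\sum_{i \in V} P_i(t) = \mathbf{1}\Bigr) \\
&= \sum_{i \in V} u_i^2 - 0 + n\eta^2 \qquad (\mathbf{1}^T \cdot r = 0) \\
&\ge \sum_{i \in V} u_i^2
\end{aligned}
\]

Similarly we get that $\sum_{(i,j) \in \overleftarrow{M}} |u_i - u_j|^2 \ge \sum_{i \in V} u_i^2$.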

We make use of the following well known lemma about the Gaussian behavior of projections.

Lemma 4.4. If $v$ is a vector of length $l$ in $\mathbb{R}^d$ and $r$ is a random unit vector in $\mathbb{R}^d$, then $E[(v^T r)^2] = l^2/d$.

We use this lemma for the vectors $(P_i(t) - \mathbf{1}/n)$, which lie in the $(n-1)$-dimensional space orthogonal to $\mathbf{1}$. Note that
$u_i = P_i(t) \cdot r = (P_i(t) - \mathbf{1}/n) \cdot r$ denotes the projection of $(P_i(t) - \mathbf{1}/n)$ onto $r$. Since $r$ is a random unit vector
from the space orthogonal to $\mathbf{1}$, we have $E[u_i^2] = \|P_i(t) - \mathbf{1}/n\|^2 / (n - 1)$.
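Hence we have

\[
\begin{aligned}
E\Bigl[\sum_{(i,j) \in M} |u_i - u_j|^2\Bigr] &\ge 2 \, E\Bigl[\sum_{i \in V} u_i^2\Bigr] \\
&= \frac{2 \sum_{i \in V} \|P_i(t) - \mathbf{1}/n\|^2}{n - 1} \\
&= \frac{2\,\psi(t)}{n - 1}
\end{aligned}
\]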

□ 
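
The deterministic inequality used in the proof of Lemma 4.3 can be checked by brute force on small vectors. The sketch below is an illustration only: for a vector $u$ whose entries sum to 0 (as is the case for $u_i = P_i(t) \cdot r$ with $r \perp \mathbf{1}$), it enumerates every perfect matching from the $n/2$ smallest entries to the rest and compares the smallest value of $\sum |u_i - u_j|^2$ with $\sum_i u_i^2$. The function name `check_bisection_bound` is an assumption.

```python
from itertools import permutations


def check_bisection_bound(u):
    """Return (worst, total) for a vector u with sum(u) == 0.

    worst: minimum over all perfect matchings from S (the n/2 smallest
           entries) to the remaining vertices of sum (u_i - u_j)^2.
    total: sum of u_i^2 over all vertices.
    The proof of Lemma 4.3 shows worst >= total.
    """
    n = len(u)
    order = sorted(range(n), key=lambda i: u[i])
    S, S_bar = order[: n // 2], order[n // 2:]
    total = sum(x * x for x in u)
    worst = min(
        sum((u[i] - u[j]) ** 2 for i, j in zip(S, perm))
        for perm in permutations(S_bar)
    )
    return worst, total


spread = check_bisection_bound([-3, -1, -1, 0, 2, 3])
skewed = check_bisection_bound([-5, 1, 1, 1, 1, 1])
```

In the second example the threshold $\eta$ equals 1 rather than 0, and the gap between the two quantities is exactly the $n\eta^2$ term from the derivation.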

Thus we have shown that (whp) the game will end in $O(\log^2 n)$ rounds.

4.2 Analysis of MatchingPlayer

If the MatchingPlayer is not able to find a perfect matching across the input bisection, then it outputs a cut. We
now prove that this cut has expansion at most $\alpha$.

Lemma 4.5. In the procedure MatchingPlayer, if the max flow between the source and the sink is less than $n/2$, then
the cut output by MatchingPlayer has expansion at most $\alpha$.

Proof. If the flow has value less than $n/2$, the minimum cut separating the source and the sink has capacity less than
$n/2$. Let the number of edges in the cut incident to the source (resp. sink) be $n_s$ (resp. $n_t$). The remaining capacity of
the cut is less than $n/2 - n_s - n_t$, and thus uses at most $\alpha(n/2 - n_s - n_t)$ edges in the original graph. Moreover, the cut
consisting of edges in the graph separates at least $n/2 - n_s$ vertices on the source side from $n/2 - n_t$ vertices on the sink
side. The expansion of this cut is at most $\alpha(n/2 - n_s - n_t) / \min(n/2 - n_s, n/2 - n_t)$, which is at most $\alpha$.

□ 

5 Conclusions 

The techniques introduced in this paper can be used to obtain polylogarithmic approximation algorithms running
in (single-commodity) max-flow time for some more problems on directed graphs, such as the balanced separator
problem and some slight generalizations of the sparsest cut problem (the details will appear in the full version of
the paper).

It would be interesting to see if one can obtain an $O(\sqrt{\log n})$-approximation algorithm for the DSC problem that
runs in single-commodity max-flow time.
A Appendix



Lemma A.1. A mixing sequence of matchings $\{M_1, \ldots, M_t\}$ forms a graph with edge expansion 1/2.

Proof. Consider the timeline graph $H = (V', E')$ defined on the vertex set $V' = V \times \{0, \ldots, t\}$ with arcs of the form
$((i, j), (i, j+1))$ and $((i, j), (i', j+1))$ where $(i, i') \in M_{j+1}$. We also set the capacity of every arc to 1/2. The random
walk described above, starting from a vertex $u \in V$, induces a flow in $H$ from $(u, 0)$ to the group $\{(1, t), \ldots, (n, t)\}$.
If the walk mixes, then the random walk delivers at least $1/2n$ units of flow to each of $(1, t), \ldots, (n, t)$. Using
induction it can be shown that concurrent unit flows from $(1, 0), (2, 0), \ldots, (n, 0)$ do not violate the arc capacities
of $H$. Observe that each arc in the union of the matchings corresponds to exactly 1 arc in $H$. Thus, mapping the flow
in $H$ to the union of matchings, we get that the latter is also able to support a concurrent unit flow from each vertex
such that each vertex is able to receive at least $1/2n$ units of flow from every other vertex. Therefore, for any cut
$(S, \bar{S})$, the union of matchings supports $|S| \cdot (n - |S|)/2n \ge \min\{|S|, |\bar{S}|\}/4$ units of flow across the cut (in both
directions). Hence, every cut has expansion at least 1/2. □






